Methods and apparatus for providing privacy-preserving global customization

ABSTRACT

Techniques and infrastructure are provided for supporting global customization. The invention enables persona profiles of user information to be maintained, and such persona profiles to be accessed by merchants. Via the persona abstraction, users control what information is grouped into a persona profile, and can selectively enable a merchant to read one of these profiles. The infrastructure of the invention employs a persona server that assists users in managing their personae. The infrastructure of the invention separates this persona server from the profile databases at which persona profile information is stored, to eliminate any single point at which different persona profiles can be tied to the same user. Since merchants also have privacy concerns, the infrastructure of the invention provides a data protection model based on tainting, by which merchants can limit how the information they contribute can be exposed.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to the U.S. provisional patent application identified by Serial No. 60/211,164, filed on Jun. 13, 2001, and entitled “Privacy-Preserving Global Customization,” the disclosure of which is incorporated by reference herein.

FIELD OF THE INVENTION

[0002] The present invention relates to global customization of network content and, more particularly, to global customization of network content with privacy mechanisms such that a user may control what information a merchant can learn about the user's activity at other merchants, and a merchant may control what information is revealed to competing merchants.

BACKGROUND OF THE INVENTION

[0003] With the increasing user acceptance of performing purchasing transactions over a data network, such as the World Wide Web (hereinafter “web”) or the Internet, merchants who host web sites at which users may purchase (or, at least, learn about) their products have an obvious financial interest in continuously attempting to improve the user's experience. Mass customization refers to the creation of a customized experience for online buyers by using technology that responds to their individual requirements and interests, see, e.g., J. Nelson, “Mass-Customization Marketing: Maximizing Value of Customers,” IDC Bulletin #17726, December 1998, the disclosure of which is incorporated by reference herein. “Customization” is sometimes also called “personalization,” though personalization also conveys the meaning of web content that the user can explicitly configure. For example, a user might create a personalized web page at a site by telling the site which stock quotes to display whenever the user visits. Here, we are primarily concerned with content that a site predicts the user will like based on information inferred about the user, rather than by explicit user instruction. Customization typically employs data mining and/or collaborative filtering to predict content that is likely to be of interest to a visitor, and presents the customized content to the visitor at opportune moments. Customization can be particularly effective when the user identifies himself or herself explicitly to the web site. In this case, customization can be much more “accurate,” in the sense that the site can employ the specific user's past browsing and purchasing history at that site to predict what content will be most effective for this user.

[0004] Global customization, by which a user's web history is shared across many merchant sites, is practiced today in several forms. A predominant form of such global customization is “ad networks” such as DoubleClick™. In this form, information about a visitor's activities at a merchant site is passed to DoubleClick™ via image hypertext links in the merchant's page. In response to these requests, DoubleClick™ returns banner advertisements customized to these activities. This customization is “global” in that this information is collected into a profile for the user (or more precisely, the browser) that is used to customize ads for the same user on his or her future visits to DoubleClick™-enabled sites.

[0005] Recently, even more ambitious sharing of consumer web activity has been developed by companies such as Angara™ and I-behavior™ (or Net Perception™). Both companies profile users, Angara using an opt-out approach and I-behavior using an opt-in approach, and provide targeted information to merchants about a user for the purposes of customization. However, none of these existing approaches provides support for users and merchants to specify policies that limit who can obtain information they contribute.

[0006] Further, electronic wallets, such as the Microsoft Passport™ and the Java Wallet™, may offer possibilities for global customization. Wallets vary with respect to what information they retain about user activities, and to what extent they share this information with participating merchants. However, to the extent that they do retain information (for example, they often retain receipts for purchases), such wallets pose a privacy risk to both users and merchants. From the user perspective, these wallets hold identifying information for the user in conjunction with any behavioral information, and, therefore, stored behavioral profiles are not anonymous. Moreover, to the extent that behavioral information is conveyed to merchants, merchants are unable to specify data protection policies about how information they contribute is to be shared with others. The above-mentioned privacy risks have been cited as a major tension between wallet vendors and both online merchants and users; see, e.g., K. Cassar et al., “Digital Wallets, Pursuing Dual Wallet Strategy Before Leverage is Lost,” Jupiter Strategic Planning Services/DCS99-14, February 1999, the disclosure of which is incorporated by reference herein.

[0007] Still further, pseudonymous e-mail addresses, or “nyms,” are known to be used in e-mail applications, see, e.g., D. Mazières et al., “The design, implementation, and operation of an email pseudonym server,” Proceedings of the 5th ACM Conference on Computer and Communication Security, pages 27-36, November 1998; and I. Goldberg et al., “Freedom Network 1.0 architecture,” November 1999, the disclosures of which are incorporated by reference herein. Users post to newsgroups or send emails under a nym in such a way that recipients may not easily be able to correlate multiple nyms as belonging to the same user. However, nyms do not provide mechanisms and support for users and merchants to specify policies that limit who can obtain information they contribute such that global customization of network content may be performed in a sufficiently privacy-preserving manner.

SUMMARY OF THE INVENTION

[0008] The present invention provides techniques for global customization of network content with privacy mechanisms such that, in one aspect of the invention, a user may control what information an entity can learn about the user's activity at other entities, and, in another aspect of the invention, a particular entity may control what information is revealed to competing entities. In a preferred embodiment of the invention, the entities are merchants.

[0009] Accordingly, the inventive techniques enable global profiles of each user's behavior to be maintained, so that a merchant can customize content for a user based on that user's activities, even at other merchants. At the same time, however, the techniques are privacy-preserving, in the sense that users and merchants can control how information about them is shared. Specifically, the inventive techniques enable each user to control which of his or her information can be gathered together in a profile, and do so with natural extensions to the user's browsing experience. They also enable each merchant to specify which other merchants can learn the information that it contributes to a profile and/or other information derived therefrom. As mentioned above, existing approaches lack such data protection models.

[0010] To this end, in accordance with one aspect of the invention, the present invention protects a user by employing the abstraction of a “persona,” or a role, as will be explained in detail below, in which a user conducts web activity. A user can have many personae, with the property that only the user's activities undertaken while in a given persona can be linked in a profile. This gives the user a convenient and natural way to partition information about himself or herself into persona profiles that he or she can selectively reveal. For example, a user may create one persona for work, one for recreation, and one for when his or her children use the browser.

[0011] Further, in accordance with another aspect of the invention, the present invention protects entities such as merchants by employing a powerful protection model based on “tainting,” as will be explained in detail below, which offers fine-grained control over not only which merchants can access the records they reveal about their customers, but also which merchants can access information derived from those records. This gives merchants the ability to specify different gradations of access control for partners, competitors, and others.

[0012] In accordance with the principles of the present invention, consider the following example of the type of capabilities that may be realized based on the inventive teachings provided herein. Suppose a user purchases a ticket to Egypt at a travel web site. Later, the user visits an online bookstore, which learns of the user's interests in travel and Egypt via the techniques of the present invention. The site thus customizes its pages based on this information, highlighting books about the pyramids, tours and travel in Egypt, etc. When the user visits an online electronics store, the entry page highlights its new Egyptian-to-English electronic pocket translator, and so on. However, at any point, the user can switch to a different persona profile that reflects nothing about these activities, and so this information will not be conveyed to sites the user subsequently visits. Moreover, the bookstore can specify that records it contributes to the profile (e.g., that the user bought books about Egyptian art) not be made available to other bookstores, since these competing bookstores could use this information to gain this user as a customer.

[0013] These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] FIG. 1 is a block diagram illustrating a commerce server system, e.g., a server system which runs a merchant's web site, according to an embodiment of the present invention;

[0015] FIG. 2 is a block diagram illustrating an overview of an infrastructure according to an embodiment of the present invention;

[0016] FIGS. 3A through 3D are diagrams illustrating portions of an interface a user may use to interact with a personae server according to an embodiment of the present invention;

[0017] FIG. 4 is a flow diagram illustrating a persona access credentials (PAC) request protocol according to an embodiment of the present invention;

[0018] FIG. 5 is a diagram illustrating a data structure stored with a record according to an embodiment of the present invention;

[0019] FIGS. 6A and 6B are diagrams respectively illustrating record reading operations supported by a profile database according to an embodiment of the present invention;

[0020] FIG. 7 is a diagram illustrating portions of a configuration interface by which a merchant may define sets of merchants at a profile database according to an embodiment of the present invention; and

[0021] FIG. 8 is a block diagram illustrating an exemplary architecture of each of the computer systems operating in the infrastructure shown in FIG. 2.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0022] The present invention will be explained below in the context of the World Wide Web, or the Internet, wherein users (in accordance with browsing software running on their respective computer systems) are able to visit merchant web sites (running on one or more respective servers) in order to browse and/or buy products, services, etc. However, it is to be understood that the present invention is not so limited. Rather, the methodologies and infrastructure of the invention may be more generally applied to any distributed network environment wherein users are able to visit sites hosted by respective entities and wherein it is desirable for the users and/or entities to have and use mechanisms that preserve their respective privacy, at their own discretion, within the context of global customization.

[0023] In order to facilitate reference to certain aspects of the invention, the remainder of the detailed description is divided into the following sections: (I) Abstractions; (II) Infrastructure; (III) Personae Management; (IV) Data Sharing Among Merchants; (V) Illustrative Applications; and (VI) Exemplary Computer System Architecture. Also, for further ease of reference, certain of these sections are, themselves, divided into subsections.

[0024] I. Abstractions

[0025] In this section, we describe the abstractions the present invention offers to merchants and users, as well as certain exemplary advantages that are realized therefrom.

[0026] First, it is to be understood that the techniques of the present invention do not limit the collection of information that already takes place today on the web. Preventing data collection by technical means is the topic of numerous other research and commercial projects in anonymous or pseudonymous web access, see, e.g., M. Reed et al., “Anonymous connections and onion routing,” IEEE Journal on Selected Areas in Communication 16(4):482-494, May 1998; M. K. Reiter et al., “Crowds: Anonymity for web transactions,” ACM Transactions on Information and System Security 1(1):66-92, November 1998; E. Gabber et al., “On secure and pseudonymous client-relationships with multiple servers,” ACM Transactions on Information and System Security 2(4):390-415, November 1999; and I. Goldberg et al., “Freedom Network 1.0 architecture,” November 1999, the disclosures of which are incorporated by reference herein. The present invention may be considered as complementing this research by providing techniques for controlled information sharing that are compatible with existing web infrastructure and even with anonymous web access, e.g., as implemented by the aforementioned anonymizing systems. Most anonymizing systems can be configured to remove HTTP (HyperText Transfer Protocol) cookies from traffic between the browser and web sites. Since, as will be explained, the present invention may preferably employ cookies, the present invention is compatible with these anonymizing systems when they are configured to not remove cookies. If cookies are not available for use, either due to an anonymizing system or because the user has disabled their use in his or her browser, then the inventive techniques will have no effect and will be invisible to him or her.

[0027] Similarly, the techniques of the present invention do not include preventing various privacy attacks that, e.g., enable a web site to directly observe a user's activity at other web sites, see, e.g., E. W. Felten et al., “Web spoofing: An Internet con game,” Proceedings of the 20th National Information Systems Security Conference, October 1997, the disclosure of which is incorporated by reference herein. The same measures and precautions against such attacks may be applied by users of the inventive infrastructure.

[0028] Second, it is to be understood that the techniques of the present invention do not prevent merchants from sharing information outside the inventive infrastructure. Rather than trying to force the adoption of such infrastructure by eliminating alternatives to it, the invention offers a more publicly acceptable and valuable infrastructure to enable sharing. As a result, the threats we consider do not admit collaborative misbehavior by merchants to convey more information among themselves than is allowed by the policies of the invention. Merchants could always convey that information outside the infrastructure, and indeed risk being detected if they misuse the infrastructure for that purpose. Auditing compliance with the policies of the invention is discussed below. That said, the invention provides little or no help to merchants who attempt to share data outside the infrastructure.

[0029] Accordingly, the invention enables each user to partition behavioral record-keeping by merchants into several persona profiles that cannot be linked by the components that possess them, and to control which persona profile is exposed to each merchant. This is accomplished by separating storage of persona profiles from the ability to link those persona profiles to a single user. For merchants that contribute information to persona profiles, the invention provides a protection model for the merchant to control what other merchants can benefit from those records. It is to be appreciated that these features and advantages may be realized without changing existing web infrastructure, e.g., without the use of custom client-side software (in contrast to, e.g., the Java Wallet).

[0030] In order to prevent abuse of the infrastructure of the invention, auditing may be performed in order to detect (and, thus, discourage) forms of abuse that cannot be inherently prevented, or charging models may be implemented in order to motivate merchants to behave appropriately. For example, since merchants cannot necessarily be prevented from sharing data outside the inventive infrastructure, merchants may be subject to an audit by an organization like TRUSTe or BBBOnline as a condition of using the inventive infrastructure. Other behavior that can be audited is the accuracy of records that merchants contribute to a persona profile, though doing so requires a different form of audit, i.e., active probing. To conduct this form of audit, an auditing agency may play the role of a user who visits the merchant and conducts some transaction. Afterward, the records the merchant contributed may be examined for accuracy. To motivate merchants to contribute records at all, a price charged to merchants may be made inversely proportional to the number of records they contribute.
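The inverse-proportional charging model mentioned above can be illustrated with a minimal sketch. The function name `usage_fee`, the `base_fee` parameter, and the treatment of a merchant with zero contributions are assumptions made purely for illustration; the description does not specify a fee formula beyond inverse proportionality.

```python
def usage_fee(base_fee: float, records_contributed: int) -> float:
    """Return a fee inversely proportional to the number of contributed records.

    Assumption: a merchant with zero contributions pays the full base fee,
    which also avoids dividing by zero.
    """
    if base_fee < 0 or records_contributed < 0:
        raise ValueError("base_fee and records_contributed must be non-negative")
    return base_fee / max(records_contributed, 1)
```

Under this sketch, a merchant that doubles its contributions halves its fee, which is one simple way to reward contribution.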

[0031] (a) The User's Perspective

[0032] As mentioned above, an abstraction that the invention provides to the user is that of having the ability to have multiple personae. A persona represents a role in which the user engages in web activity. Examples of personae include, but are not limited to, “work,” “entertainment,” “medical,” “shopping,” “investing,” etc. The relevant feature of a persona is that activities undertaken by the user while acting in a given persona can be linked and profiled across sites. So, if a user visits two different sites under a “work” persona, then information about the user's activities undertaken at each site is available to the other site, provided that both sites allow this. However, if the user visits a site under a “work” persona, then the user need not fear that his or her “entertainment” activities will become known to that site.

[0033] It is to be understood that while the terms “persona” and “persona profile” may occasionally be used interchangeably herein, more specifically, a persona represents a role a user engages in during network activities and a persona profile is a set of information accumulated in association with a given persona. In some cases, the term “profile” is used wherein, from its context, it is understood to refer to a profile associated with a persona, as opposed to a user profile.

[0034] Because it is intrinsically difficult to prevent the correlation of two personae of the same user at a single site, e.g., the two personae could be linked based on IP (Internet Protocol) address or even browsing behavior, the invention by default allows a merchant to read the profile of only one persona per user. This is achieved by granting read credentials to a merchant for only that persona. For a different persona employed by the user on a subsequent visit to that site, the merchant is not given credentials to read that persona's profile. However, the merchant may still be given credentials to contribute records to this different persona, if the user permits.
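The default policy described above can be sketched as follows. This is an illustrative sketch only: the class name `DefaultReadPolicy`, the string-valued rights, and the choice to treat the first persona seen at a merchant as the one readable persona are assumptions, and the bookkeeping shown here is presumed to live wherever the user-to-persona correspondence is already known.

```python
from typing import Dict, FrozenSet, Tuple


class DefaultReadPolicy:
    """Grant a merchant read rights to at most one persona per user."""

    def __init__(self) -> None:
        # (user, merchant) -> the single persona that merchant may read
        self._readable: Dict[Tuple[str, str], str] = {}

    def rights_for(self, user: str, merchant: str, persona: str,
                   allow_insert: bool = True) -> FrozenSet[str]:
        # The first persona presented to a merchant becomes its readable one.
        first = self._readable.setdefault((user, merchant), persona)
        rights = set()
        if first == persona:
            rights.add("read")
        if allow_insert:
            # Contributing records is still possible if the user permits.
            rights.add("insert")
        return frozenset(rights)
```

For example, a user who first visits a site under a "work" persona and later returns under an "entertainment" persona would, under this sketch, yield read and insert rights on the first visit and at most insert rights on the second.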

[0035] Users may configure personae on various parameters, which will be described below in Section III. A user selects a persona when a site requests a persona and one has not already been selected by the user for this browsing session. An exemplary interface by which the user conducts this selection is also described below in Section III. A preferred technique of the invention is opt-in, i.e., the interface is not presented to the user unless the user previously enrolled his or her browser to receive persona requests, and at any point the user may disable a persona and later re-enable it via a simple interface. It is important that users be able to understand and set the policies associated with personae, and to easily switch between personae when appropriate. A preferred implementation of the present invention is constructed with such considerations in mind.

[0036] (b) The Merchant's Perspective

[0037] FIG. 1 is used to illustrate a merchant's perspective with respect to an abstraction used to preserve privacy while enabling global customization in accordance with the invention. Specifically, FIG. 1 shows a simplified architecture of a commerce server system, e.g., a server system which runs a merchant's web site, according to an embodiment of the present invention. It is known that commerce servers are often constructed using database-driven templates that enable the dynamic creation of web pages. By way of example, G. W. Treese et al., “Designing Systems for Internet Commerce,” Addison-Wesley, Reading, Mass., 1998, the disclosure of which is incorporated by reference herein, describes techniques for designing systems used in Internet commerce. Such web page templates are written in a template language and stored in the web server file system. An exemplary web server file system, such as a commerce server system, is illustrated in FIG. 1. In particular, the system includes a web server 10, web page templates 12 and databases 14. As is known, the template language offers primitives for posing queries 16 to the databases 14, performing computation, and rendering HTML (HyperText Markup Language). Thus, generally, when the web page is requested (step 101), the web page template is interpreted to render a web page (step 102) based on information retrieved from the databases 14 in accordance with a catalog database 18 (part of step 103).

[0038] Advantageously, as shown in FIG. 1, the present invention augments a commerce server file system with another “database” per merchant, called a Global Customization Engine (GCE) 20. Conceptually, the GCE serves as another database that web page templates 12 can query. However, rather than being a database of only local information, the GCE interacts with remote components of the infrastructure of the invention (e.g., profile databases or PDBs, as will be explained in detail below) to obtain web history information about (the persona of) a visitor to this site and to contribute information about this visitor. Web page templates 12 query the GCE 20 (part of step 103) for information about the visitor, and they or other components (e.g., Common Gateway Interface or CGI scripts) insert records about this persona at the GCE (also part of step 103). The GCE may propagate these records to other components of the infrastructure of the invention and eventually to other merchants, as will be explained below.

[0039] An interface between the merchant site and the GCE enables the merchant site to register an identifier of the merchant's choice along with a “persona access credential” (PAC) that is passed to the merchant site if the user's persona management policy allows. From then onward (until the PAC expires), web page templates can query the GCE using the chosen identifier. The GCE uses the PAC to retrieve information from the infrastructure of the invention about the persona associated with the corresponding PAC. The PAC also enables the merchant to contribute information about the visitor to the infrastructure of the invention. When the merchant site inserts records at the GCE, the merchant specifies access control information that constrains what other merchants can read these records or records derived from them. An illustrative data protection model for accomplishing this aspect of the invention will be described below in further detail in Section IV.
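The register, query, and insert interface just described can be sketched as follows. This is a minimal in-memory illustration, not the interface of any actual GCE: the names `PAC`, `InMemoryPDB`, `GCE`, `register`, `query`, and `insert` are hypothetical, the PDB is a local stand-in for a remote component, and the per-record set of permitted readers is a deliberate simplification that is assumed here in place of the tainting-based protection model of Section IV.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class PAC:
    persona_id: str    # opaque persona name; carries no user identity
    expires_at: float  # expiry time, in seconds since the epoch


class InMemoryPDB:
    """Local stand-in for a remote profile database (PDB)."""

    def __init__(self) -> None:
        self._records: Dict[str, List[Tuple[dict, FrozenSet[str]]]] = {}

    def insert(self, persona_id: str, record: dict, readers: FrozenSet[str]) -> None:
        self._records.setdefault(persona_id, []).append((record, readers))

    def read(self, persona_id: str, merchant: str) -> List[dict]:
        # Return only the records this merchant is allowed to read.
        return [rec for rec, readers in self._records.get(persona_id, [])
                if merchant in readers]


class GCE:
    """Per-merchant engine that maps site-chosen identifiers to PACs."""

    def __init__(self, merchant: str, pdb: InMemoryPDB,
                 clock: Callable[[], float] = time.time) -> None:
        self._merchant = merchant
        self._pdb = pdb
        self._clock = clock
        self._pacs: Dict[str, PAC] = {}

    def register(self, identifier: str, pac: PAC) -> None:
        self._pacs[identifier] = pac

    def _live_pac(self, identifier: str) -> Optional[PAC]:
        # A PAC is usable only while it is registered and unexpired.
        pac = self._pacs.get(identifier)
        if pac is None or pac.expires_at <= self._clock():
            return None
        return pac

    def query(self, identifier: str) -> List[dict]:
        pac = self._live_pac(identifier)
        return [] if pac is None else self._pdb.read(pac.persona_id, self._merchant)

    def insert(self, identifier: str, record: dict, readable_by: Iterable[str]) -> bool:
        pac = self._live_pac(identifier)
        if pac is None:
            return False
        # Assumption: the contributing merchant can always read its own records.
        readers = frozenset(readable_by) | {self._merchant}
        self._pdb.insert(pac.persona_id, record, readers)
        return True
```

In this sketch a travel site could insert a purchase record readable by a bookstore, and the bookstore's own GCE, holding a PAC for the same persona, would then see that record when a page template queries it.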

[0040] In one preferred embodiment of the invention, a GCE may be integrated with a commercial commerce server such as an iMerchant Pro 2.0 (made by Premium Hosting Services, Inc.). The illustrative commerce server supports a web page template language called iHTML, via which web pages pose queries to the GCE. In such an embodiment, the merchant registers a PAC with a customer identifier that it also sets as an HTTP cookie in the user's browser for the current browsing session. In this way, when the site gets an HTTP request from that user, it can pass the associated cookie to the GCE to obtain information about the (persona of the) visitor.

[0041] II. Infrastructure

[0042] In this section, we describe an illustrative overview of an infrastructure of the invention according to an embodiment of the invention, in the context of FIG. 2, which supports the interfaces mentioned above in Section I and which will be further described below.

[0043] As shown in FIG. 2, the infrastructure 200 of the invention comprises: a user computer system 202 which executes browser software; merchant web site server systems 204-1 through 204-M; a personae server 206; and profile databases (PDBs) 208-1 through 208-N. The components of the infrastructure 200 are operatively coupled via a network 210 which, in this embodiment, is the Internet.

[0044] The user computer system 202 is the computer system through which a user accesses the merchant web sites during his or her online shopping endeavors. It is also the computer system through which the user accesses the personae server 206 to request and specify parameters of various personae that he or she wishes to operate under while visiting various merchant web sites. An exemplary interface with the personae server 206 is described in the next section. It is to be understood that the user accesses these features through the browser software running on his or her system. One advantage of the infrastructure of the invention is that the browser software need not be modified to operate in the infrastructure. Further, it is to be understood that the privacy-preserving global customization techniques of the invention may be implemented within an existing network environment such as the Internet.

[0045] Each of the merchant web site server systems 204-1 through 204-M is a commerce server file system with which the user's computer system communicates while visiting the respective site. Each web site server system is configured with a GCE, as shown in FIG. 1. FIG. 2 illustrates M server systems, where M may be any number of web sites on the network which are configured to operate in accordance with the inventive infrastructure.

[0046] The personae server 206 resides in the network to support the management of user personae and the issuance of PACs. Each user who employs the inventive infrastructure holds an account at the personae server. This account allows the user to create new personae and manage policies for existing personae. Users must trust the personae server to accurately enforce the policies the user specifies for his or her personae, and to not disclose relationships between personae and users to merchants. In order to scale, in one implementation, the personae server may be a virtual server with one domain name. This name may be dynamically mapped to an actual personae server depending on a range of criteria, including the proximity of the server to the client, the current load and availability of servers, etc. Techniques for implementing virtual servers and the dynamic mapping of DNS (domain name system) queries to actual servers are well known in the art and, therefore, will not be described in further detail herein. One example of virtual server techniques that may be employed is the set of techniques used by Akamai.

[0047] The profile databases, or PDBs, each may contain records inserted by merchant server systems 204-1 through 204-M (via their respective GCEs) about different personae. As shown, there may be numerous, unrelated PDBs in the infrastructure 200. N represents any number of PDBs which are desirable to support the abstractions of the invention. It is to be understood that there does not have to be, and likely is not, the same number of web site server systems 204 as there are PDBs 208. A merchant chooses the PDBs to which it inserts records as those it trusts to enforce the data protection policies that the merchant specifies. PDB support for merchant data protection will be described in detail below in Section IV. Users must trust the PDBs of the merchants to which they provide PACs to limit merchants to the forms of access specified in those PACs, as will be explained in detail below in Section III. However, since users may not be aware of the PDBs a merchant uses, this trust may need to be gained with, e.g., the assistance of an auditing body, examples of which were previously described.

[0048] It is to be understood that, as shown in FIG. 2, a user's personae server is separate from the merchant servers that the user visits, as well as from the profile databases the merchants use. Since the personae server stores the correspondences between personae and users, joining the personae server with profile databases may enable construction of a profile per user, as opposed to per persona. Thus, the personae server is preferably established as a privacy preserving site devoted to this purpose. PDBs may be offered by service providers, particularly as a value-added feature for commerce server hosting.

[0049] The type of data that merchants insert into PDBs is preferably limited to information about what a user acting under a particular persona did while at their web sites. In particular, the inserted data preferably excludes information that could be used to link two personae, such as the IP address from which the user visited or any other identifying information like an email address. Note that the decision to disallow, by default, the reading of multiple personae by any one merchant takes away the incentive to do otherwise: a single merchant, even if in theory it could link two personae to the same user, is not given PACs to read data for both personae. This restriction on the type of data merchants insert thus primarily serves to prevent PDBs from linking personae associated with the same user.
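The restriction on inserted data can be illustrated with a small sketch in which a merchant strips identifying fields from a record before contributing it. The field names `ip_address` and `email`, the constant `IDENTIFYING_FIELDS`, and the function `sanitize_record` are hypothetical; the description names only IP addresses and email addresses as examples of linking information, so a real deployment would presumably need a broader, audited list.

```python
from typing import Any, Dict

# Assumed field names for the two kinds of identifying data named in the text.
IDENTIFYING_FIELDS = frozenset({"ip_address", "email"})


def sanitize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record without fields that could link personae."""
    return {key: value for key, value in record.items()
            if key not in IDENTIFYING_FIELDS}
```

The original record is left unmodified, so the merchant can keep its own local copy while contributing only the behavioral portion.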

[0050] The invention implements a protocol by which a merchant site requests a PAC for a persona from the personae server, the personae server issues that PAC, and the merchant uses it (via its GCE) to read or insert information about the persona to a PDB. This protocol is described in detail below. This protocol requires user input only in the case that there is no current persona for the user. The interface that the user experiences in this case is described in detail below in Section III.
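The outline of this exchange can be sketched as follows. This is not the protocol of FIG. 4, whose details are given later; it only illustrates that a PAC is issued without user input when a current persona exists, and that user input is needed otherwise. The names `PersonaeServer`, `select_persona`, and `request_pac`, the `session` argument (standing in for whatever the browser presents to the personae server, e.g., a cookie), and the random token are all assumptions.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PAC:
    persona_id: str  # opaque persona name, not a user identifier
    merchant: str    # merchant the credential was issued to
    token: str       # unguessable value a PDB could check (assumed)


class PersonaeServer:
    def __init__(self) -> None:
        # browser session -> persona currently selected by the user
        self._current: Dict[str, str] = {}

    def select_persona(self, session: str, persona_id: str) -> None:
        """Record the user's choice of persona for this browsing session."""
        self._current[session] = persona_id

    def request_pac(self, session: str, merchant: str) -> Optional[PAC]:
        """Issue a PAC, or return None when the user must first pick a persona."""
        persona_id = self._current.get(session)
        if persona_id is None:
            return None
        return PAC(persona_id, merchant, secrets.token_hex(16))
```

A `None` result corresponds to the one case in which the user would be shown the persona selection interface; every later request in the same session is answered silently.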

[0051] III. Personae Management

[0052] As already discussed, personae are the basic tool by which users partition their behaviors into profiles. A main challenge to implementing personae is to enable the user to easily configure his or her personae with the desired policies for protecting his or her privacy, and in some cases to make policy decisions for the user so that managing personae is not a burden.

[0053] (a) Persona Configuration

[0054] The policies that describe how personae are managed and how PACs are distributed can significantly impact how a user's data is shared. Illustrative ones of these policies, and how they can be configured, are described below.

[0055] (i) Rights conveyed with PACs. As described in the previous section, a PAC granted to a merchant enables that merchant to access the information in a PDB associated with the persona named in this PAC. With one exception described below, by default, a PAC conveys “read” rights, which enable the merchant to read records in the PDB associated with the persona, and “insert” rights, which enable the merchant to insert new records about that persona. However, a user could grant only one of these to a merchant. For example, a user may grant a site only read access if the user does not want his or her activities at that site added to his or her profile associated with the particular selected persona. The user may grant only insert access if the user does not want the site he or she is visiting to learn his or her other profiled data, but the user is comfortable with that site adding data to the profile. A third type of access can be granted: “delete” rights, which enable the merchant to delete records associated with the persona from the PDB. Delete rights make it possible to set up a monitoring site that users can visit to review the information stored about their personae in a PDB and delete records of their choosing.

[0056] (ii) Exposure of multiple personae for the same user at one merchant. As previously discussed in Section I, granting PACs to a merchant with read rights for two different personae of the same user potentially enables the merchant to “merge” the personae profiles, even if the PACs are sent to that merchant in two different sessions. For this reason, it is preferred that a default policy be adopted that a merchant site be granted read rights to only one persona per user, namely, the first persona under which the user visits the site. This policy, however, may be limiting in certain cases. For example, many web sites may naturally be visited by the same user in different personae, such as search engines and portal sites that may serve as general “launch points” for content regardless of what type of content is sought. Allowing only one persona to be read by each of these sites may limit the amount of customization that site can perform.

[0057] (iii) Duration of a persona as a default. When a user selects a persona in which to browse, that persona preferably becomes the default, or “current,” persona for some period of time, in order to minimize interruptions in the user's browsing experience. A configurable parameter of a persona is the length of this duration. The default setting for this parameter is the duration of the browsing session, i.e., until the user closes his or her browser. Other alternatives are a specified time period (e.g., 30 minutes), or simply to not make the persona a default at all. A persona, even if the default, can be changed by the user and will not be made readable to a site if that site previously was sent a PAC containing read rights for a different persona of the same user.

[0058] (iv) Duration of PACs. The duration for which a PAC (and the access rights it conveys) is valid can have significant ramifications to user privacy. On one end of the spectrum, a PAC granting read access that is valid indefinitely enables the site that receives it to monitor this persona arbitrarily far into the future. On the other end of the spectrum, a PAC may be limited for use only within a very tight time frame, perhaps only for a minute or so before it has to be renewed. Here, the tradeoff involves the additional overhead of frequent renewals, but the benefit to the user is fine-grained control over the duration for which he or she can be monitored (in the case of read access) or data about him or her can be added (in the case of insert access). In a preferred implementation, a short duration period for PACs by default is adopted, in order to better protect the user's privacy.
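The default rights policy of items (i) and (ii) can be illustrated with a minimal Python sketch. It assumes the personae server remembers, per user and merchant, the first persona exposed with read rights; the names `PolicyState` and `default_rights`, and the string identifiers, are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyState:
    """Hypothetical personae-server state: per (user, merchant), the first
    persona that was exposed to that merchant with read rights."""
    readable_persona: dict = field(default_factory=dict)

    def default_rights(self, user: str, merchant: str, persona: str) -> frozenset:
        """Return the default rights for a PAC naming `persona` issued to `merchant`."""
        # Record the first persona under which this user visited the merchant.
        first = self.readable_persona.setdefault((user, merchant), persona)
        if first == persona:
            return frozenset({"read", "insert"})
        # A different persona of the same user is already readable at this
        # merchant: withhold read so the two profiles cannot be merged there.
        return frozenset({"insert"})
```

A user-specified configuration (for example, read only, or adding delete) would override these defaults; that override is not modeled in this sketch.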

[0059] (b) PAC Format

[0060] Persona access credentials (PACs) are granted by a personae server to a merchant to enable the merchant to read, insert and/or delete records for this persona. In accordance with a preferred embodiment of the invention, a PAC is a structure containing the following fields:

[0061] 1. An identifier for the merchant to which the PAC is being issued. This identifier is used by PDBs to verify that the merchant presenting a PAC is the same merchant to which that PAC was granted. This identifier is the public key that PDBs use to authenticate requests from the merchant. This public key must be conveyed to the personae server within a certificate that is appended to the PAC request and signed by a certification authority known to the personae server.

[0062] 2. An expiration time. This time is calculated as a function of the PAC duration as described above in subsection (a).

[0063] 3. Access rights. By default, these include both read and insert permissions, or only insert permission if this PAC is being issued to a merchant to which a PAC containing read permission for another persona of the same user was previously issued. However, it is possible that the user might choose a different configuration of access rights, possibly including the additional delete permission.

[0064] 4. A digital signature on the above items. When a persona is created at the personae server, the server creates a new public key pair for the persona. That private key is used to sign all PACs for that persona.

[0065] 5. The persona public key. The public key matching the private key used to sign the PAC is sent with the PAC. This public key serves as the long-term identifier for the persona.

[0066] A PDB verifies a PAC accompanying a merchant request by first verifying its signature using the public key contained in the PAC (i.e., the persona public key), and verifying that the PAC has not expired. It then compares the access rights granted in the PAC to the request that the merchant is making, to determine whether it should grant this request. If the request is allowed, the PDB performs the request on the data associated with the persona public key; i.e., this public key is used as the index for a persona's data. Note that the persona public key need not be certified in any way. If the merchant forges a PAC using a different public key, then it is merely posing queries to a nonexistent persona for that user.
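The five PAC fields and the PDB-side checks described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the invention: the RSA arithmetic is a textbook toy with fixed small primes and no padding (not secure, and far from 1024-bit moduli); the merchant identifier is a plain string compared for equality rather than a certified public key used to authenticate the request; and all names (`PAC`, `new_persona_keypair`, `issue_pac`, `pdb_allows`) are hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass

# Toy RSA parameters for illustration only (two Mersenne primes); NOT secure.
_P, _Q, _E = 2**61 - 1, 2**89 - 1, 65537


def new_persona_keypair(p=_P, q=_Q, e=_E):
    """Return ((n, e), d): a toy persona public key and its private exponent."""
    return (p * q, e), pow(e, -1, (p - 1) * (q - 1))


@dataclass(frozen=True)
class PAC:
    merchant_key: str    # field 1: identifier (public key) of the merchant
    expires_at: float    # field 2: expiration time, seconds since the epoch
    rights: frozenset    # field 3: subset of {"read", "insert", "delete"}
    signature: int       # field 4: signature over fields 1-3
    persona_key: tuple   # field 5: persona public key (n, e)


def _digest(merchant_key, expires_at, rights, n):
    """Hash fields 1-3 into an integer below the modulus n."""
    msg = f"{merchant_key}|{expires_at}|{','.join(sorted(rights))}".encode()
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n


def issue_pac(merchant_key, rights, lifetime_s, persona_key, private_exp, now=None):
    """Personae server side: sign fields 1-3 with the persona private key."""
    now = time.time() if now is None else now
    expires_at = now + lifetime_s
    n, _ = persona_key
    sig = pow(_digest(merchant_key, expires_at, rights, n), private_exp, n)
    return PAC(merchant_key, expires_at, frozenset(rights), sig, persona_key)


def pdb_allows(pac, presenting_merchant_key, operation, now=None):
    """PDB side: check signature, expiry, presenting merchant, then rights."""
    now = time.time() if now is None else now
    n, e = pac.persona_key
    expected = _digest(pac.merchant_key, pac.expires_at, pac.rights, n)
    if pow(pac.signature, e, n) != expected:
        return False  # signature does not verify under the persona public key
    if now >= pac.expires_at:
        return False  # PAC has expired
    if presenting_merchant_key != pac.merchant_key:
        return False  # PAC was issued to a different merchant
    return operation in pac.rights
```

If the request is allowed, the PDB would then use `pac.persona_key` as the index for the persona's records, so a PAC carrying a different key simply addresses a different (possibly empty) record set.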

[0067] In a preferred implementation, persona public keys are RSA keys (as described in R. L. Rivest et al., “A method for obtaining digital signatures and public-key cryptosystems,” Communications of the ACM, 21(2):120-126, February 1978, the disclosure of which is incorporated by reference herein) with 1024-bit moduli.

[0068] (c) A User Interface

[0069] In this subsection, some portions of an illustrative interface the user may use to interact with the personae server are shown. The personae server presents this interface to the user, via his or her computer display, when a merchant requests a persona and there is no current persona for the user at the personae server. When this happens, a new, smaller browsing window 302 is presented to the user, as shown in FIG. 3A. This offers the user three options from which to select by clicking thereon, namely, selecting a persona (304), denying this request for a persona (306), or denying all persona requests until further notice (308).

[0070] If the user chooses the second or third options (306 or 308), then this window immediately disappears. Note that in this case, the personae server need not know who the user is, and the user need not even have an account with the personae server. In this case, however, choosing the third option (308) denies all persona requests for anyone using this browser, rather than only for this user. So, the user who does not want to be bothered with personae management can disable the system easily. If the user chooses to deny all personae requests until further notice (308), then the user must visit a URL (uniform resource locator) at the personae server in order to re-enable persona requests to his or her browser.

[0071] If the user chooses the first option (304) and has not previously logged in during this browsing session, the user proceeds to a login screen 310 shown in FIG. 3B. If the user does not already have an account at the personae server, he or she can create one by checking the appropriate box (312). Otherwise, the user simply logs in by entering his or her account number (314) and password (316), without checking the box.

[0072] After logging in, the user can create new personae or select an already-existing one. This may be done in accordance with screen 318 and persona creation/selection menu 320, as shown in FIG. 3C. If the user selects an already-existing persona, then this window now disappears and a PAC for that persona is issued to the merchant. If the user chooses to create a new persona, then the screen 322 shown in FIG. 3D appears. This allows the user to choose a name for the persona (324) and configure some basic parameters for the persona. For example, the user may specify that, once selected, the persona should be accessible to: only the site for which the user selected it (326); any site the user visits during a browsing session (328); or any site the user visits in the next n minutes (330), where n is also selectable. The user may also specify that he or she be asked every time before exposing this particular persona to a site (332).

[0073] It is to be understood that there may be other screens (not shown) associated with the personae server that include interfaces to modify other parameters of personae (as described above in subsection (a)), and interfaces for disabling or changing current personae.

[0074] (d) A PAC Request Protocol

[0075] Referring now to FIG. 4, in accordance with the present invention, an illustrative protocol is shown by which a merchant site M requests a PAC for a persona from the personae server P, P issues that PAC, and M uses it (via its GCE) to read or insert information about the persona to a PDB D. It is to be understood that, with reference to FIG. 2, the merchant site M represents a merchant site server system 204, the personae server P represents personae server 206, and PDB D represents a PDB 208.

[0076] The protocol begins by the user directing his or her browser U to the merchant site as usual (step 401). It is to be understood that U represents user system 202 in FIG. 2. At any point in M's interaction with U, M may redirect U to a well-known CGI script on the personae server P (step 402). This redirection need not preclude the merchant from presenting a page to the user; e.g., the redirection may be in a hidden frame in the user's browser. Appended to this URL are arguments including a URL at site M to where the PAC is to be sent. For example, an HTTP redirection message may be used. Moreover, in the HTTP headers of this redirection message, M sets a cookie at U that includes a customer identifier C. So, C will be returned to the server on each subsequent communication from U.

[0077] The message sent in step 402 prompts U to automatically issue a request to P for this URL (step 403). If U has not authenticated to P recently or does not have a persona already selected for this browsing session, P responds to the user with a new window indicating that M is requesting a persona for the user, and enabling the user to log in and select one (step 404) as illustrated above in the context of FIGS. 3A through 3D. Moreover, P queries the user only if the request in step 403 is accompanied by an HTTP cookie (not shown) indicating that a user previously enrolled this browser to respond to persona requests and has not since disabled them. In this sense, the illustrative technique is an “opt-in” technique. Once a persona is selected, P generates a PAC for M, according to the persona that the user selected, and redirects U to the return URL on site M with PAC appended (step 405). Along with step 404, P can set a cookie at U so that this login and selection procedure need not be repeated for each site. For example, if this cookie is set to be in effect for the duration of this browsing session, then this typically will be the last time the user will have to go through this persona selection process during this browser session.

[0078] The message sent in step 405 causes U to forward the PAC to M, accompanied by the customer identifier C that M previously set as a cookie at U (step 406). M can forward this pair to its local GCE (as illustrated above in the context of FIG. 1) and then pose queries about customer C, which the GCE translates into queries to D with PAC appended to show that M is authorized to make such queries (step 407).

[0079] When the user visits another merchant, that merchant may request a persona using the same protocol. In this case, the entire protocol above is executed transparently to the user, i.e., step 404 is skipped.
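The two redirections in the protocol (steps 402 and 405) can be sketched in Python as plain URL construction. The endpoint URL, the query parameter names `return` and `pac`, and the function names are assumptions made for illustration; the specification says only that a return URL at M is appended as an argument and that the PAC is appended to that return URL. The cookie carrying the customer identifier C and the user dialog of step 404 are not modeled.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical location of the well-known CGI script on the personae server P.
PERSONA_SERVER_CGI = "https://personae.example/cgi-bin/pac-request"


def merchant_redirect(return_url: str) -> str:
    """Step 402: the URL at P to which M redirects U, carrying M's return URL."""
    return PERSONA_SERVER_CGI + "?" + urlencode({"return": return_url})


def persona_server_redirect(request_url: str, pac_token: str) -> str:
    """Step 405: the URL at M to which P redirects U, with the PAC appended."""
    query = parse_qs(urlsplit(request_url).query)
    return_url = query["return"][0]
    # Append with "&" if the return URL already has a query string, else "?".
    separator = "&" if urlsplit(return_url).query else "?"
    return return_url + separator + urlencode({"pac": pac_token})
```

In this sketch the PAC is an opaque string; an actual PAC would be a serialization of the fields described in subsection (b).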

[0080] IV. Data Sharing Among Merchants

[0081] Data sharing among merchants takes place by merchants inserting records into, and reading records from, a PDB via their respective GCEs. For the purposes of this section, we denote the merchant who inserted the record α by merchant(α), and the persona (i.e., the persona public key, as mentioned above in section III(a)) to which the record pertains as persona(α). For ease of explanation, we do not distinguish between the merchant site and its GCE in this section.

[0082] Just as users have privacy concerns that must be addressed in the inventive infrastructure, so do merchants. Specifically, a merchant may not want to insert records into the PDB if a competing merchant can use this information, directly or indirectly, to tailor content to the same user if that user happens to visit the competing merchant. Thus, for the infrastructure of the invention to be adopted by merchants, it is important that mechanisms be provided to protect the information that they insert into the system.

[0083] (a) A Tainting Data Protection Model

[0084] The data protection model provided by the invention for this task is based on information flow models, specifically tainting. Intuitively, one datum in the system taints another if the value of the second was influenced by the value of the first. A tainting model enforces the policy that if α taints α′, then α′ can be used only in ways that α has been authorized to be used. So, for example, if the owner of α specified that it not be disclosed, then α′ cannot be disclosed either. The general idea for using tainting to protect merchant data in the inventive infrastructure is that for each record α that a merchant inserts into the PDB, the merchant specifies sets of other merchants to which it will allow that record, or anything that record taints, to flow. So, for example, if a merchant reads α and uses it to customize pages for a user, and then the merchant inserts a record α′ based on the user's subsequent behavior (e.g., perhaps the user bought what the merchant displayed), then α′ can be read only by merchants permitted by the merchant who wrote α.

[0085] However, this general model is preferably refined. A primary reason is that if a data item taints records arbitrarily far in the future by default, this will prevent much data sharing among merchants, usually unnecessarily. For example, consider the scenario outlined above in which a user purchases travel to Egypt and consequently is offered, and buys, books about pyramids from an online bookstore. Now suppose the user visits an online home furnishings store, which offers the user a reading lamp because it learns of the user's interest in reading from the records inserted by the book store. In this example, it would typically be unnecessary that records inserted by the home furnishings store, indicating the purchase of a reading lamp, be withheld from other travel stores that the user visits merely because records inserted by the first travel store are contained in their causal history.

[0086] We therefore enrich the model by requiring a merchant to specify taint classes for each record α that it inserts. Abstractly, the merchant specifies a sequence of sets CLASS_α[0], CLASS_α[1], . . . , CLASS_α[STR(α)+1], where each CLASS_α[i] is a subset of merchants, CLASS_α[i] ⊆ CLASS_α[i+1], CLASS_α[STR(α)+1] is the universe of all merchants, and STR(α) is a nonnegative integer called the taint strength of α. Intuitively, if merchant m is not a member of CLASS_α[i], then it is not allowed to read records that were derived from α by a sequence of i or fewer derivations. More precisely, suppose we define a relation → as follows:

α→α′ if and only if

merchant(α) ≠ merchant(α′) ∧  (1)

persona(α) = persona(α′) ∧  (2)

merchant(α) read α′ before inserting α  (3)

[0087] Now consider the directed acyclic graph formed by the → relation, i.e., where nodes are records and edges correspond to the → relation. For records α, α′ and merchant m, if m ∉ CLASS_α[i] and there is a path of length i or less from α′ to α, then m cannot read α′.

[0088] A merchant makes use of this model by specifying sets of merchants when it registers with a PDB and then referring to those sets to construct the taint classes for records it inserts. For example, a merchant may designate a set M_partners of partner merchants with whom it is willing to share data generously, and a set M_noncompetitors of merchants that are neither partners nor competitors. Then, when it inserts a record α, the merchant might specify STR(α)=1, CLASS_α[0] = M_partners and CLASS_α[1] = M_partners ∪ M_noncompetitors. That is, only partners can read record α, and only partners and noncompetitors can read records α′→α (and only if merchant(α′) consents). In particular, competitors of merchant(α) can read neither.

[0089] An algorithm to enforce the policy expressed by taint classes is as follows. Stored with each record α is a data structure 500, as illustrated in FIG. 5, which includes: (i) an integer value (502) called the accumulated taint strength of α, denoted ATS(α); (ii) the sets (504) CLASS_α[1], . . . , CLASS_α[STR(α)]; and (iii) pointers (506) to the records α′ such that α→α′. In addition, the data structure 500 also includes customization information (508) for the data record, e.g., “Egypt,” and a taint strength value (510) STR (STR being an integer). When a record α is inserted, the accumulated taint strength is computed as:

$\mathrm{ATS}(\alpha) = \max\left\{ \mathrm{STR}(\alpha),\ \max_{\alpha' : \alpha \rightarrow \alpha'} \left\{ \mathrm{ATS}(\alpha') \right\} - 1 \right\}$

[0090] To determine whether a merchant m can read α, the PDB executes a breadth-first search from α in the graph defined by →, truncating each descending traversal when it encounters a record α′ where ATS(α′) is less than the current depth in the search. For each record α′ visited at depth d in this traversal, m is allowed to read α only if m ∈ CLASS_α′[d]. A main result of this algorithm is:

[0091] If m ∉ CLASS_α[i] and there is a path of length i or less from α′ to α, then m cannot read α′.
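The accumulated taint strength computation and the breadth-first read check can be sketched in Python as follows. This is a minimal sketch under stated assumptions: all records belong to one persona, merchants are plain strings, `classes[i]` holds CLASS[i] for i = 0 through STR (so the taint strength is `len(classes) - 1`, and the universe class at index STR+1 is implicit), and `parents` holds the records α′ with α→α′. The names `Record`, `allows`, and `can_read` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity hashing, so records can go in sets
class Record:
    merchant: str
    classes: list                                # classes[i] is CLASS[i], i = 0..STR
    parents: list = field(default_factory=list)  # records r with self -> r
    ats: int = 0                                 # accumulated taint strength

    def __post_init__(self):
        strength = len(self.classes) - 1         # STR of this record
        # ATS = max{STR, max over parents of ATS(parent) - 1}
        inherited = max((p.ats - 1 for p in self.parents), default=strength)
        self.ats = max(strength, inherited)

    def allows(self, merchant: str, depth: int) -> bool:
        """True if `merchant` is in CLASS[depth]; beyond STR the class is the universe."""
        if depth >= len(self.classes):
            return True
        return merchant in self.classes[depth]


def can_read(merchant: str, record: Record) -> bool:
    """Breadth-first search from `record` along ->, truncated by ATS."""
    queue = deque([(record, 0)])
    seen = {record}
    while queue:
        rec, depth = queue.popleft()
        if rec.ats < depth:
            continue  # nothing at or beyond rec restricts reads at this depth
        if not rec.allows(merchant, depth):
            return False
        for parent in rec.parents:
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return True
```

For instance, following the Egypt scenario, suppose a travel store inserts a record with strength 1 whose classes exclude a rival travel store, a bookstore reads it and inserts a strength-0 record, and a home furnishings store reads the bookstore's record and inserts its own. In this sketch the rival is denied the travel and bookstore records but can read the furnishings record, because the traversal is truncated at the bookstore record.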

[0092] To insert a record α with out-degree B (i.e., there are B records α′ such that α→α′), the computation required is O(B). Determining whether a merchant can read a record takes O(E(log R+log M)) time if there are a total of M merchants, R records for this persona, and E edges among these records. However, the computation time is much less for reasonable taint strengths. In particular, if a maximum taint strength per record were imposed, then the breadth-first traversal will stop by the depth of that strength. As described above, it is assumed that the sets comprising the taint classes for a record are previously specified sets that categorize merchants relative to the inserting merchant. In this case, a record α requires storage of only O(B+C) pointers over and above the (one-time) storage of these merchant categories if there are C merchant categories.

[0093] In accordance with this inventive approach, a merchant can change taint classes for a record even after inserting that record. However, to support changes that increase the taint strength of the record, each record α stores pointers to all records α′ such that α′→α. Then, if the merchant changed the taint classes of a record α in a way that increases STR(α), the PDB recomputes ATS(α) and performs a depth-first traversal to depth ATS(α) on the DAG defined on the inverse of →, starting at α. For each node α′ visited in this traversal, ATS(α′) is updated if necessary.

[0094] Alternatively, to improve efficiency and minimize unnecessary tainting even further, the tainting model of the invention may “expire” taint over time. One such alternative is, when a merchant inserts a record α, to record pointers α→α′ only to a fixed number of records α′ most recently read by merchant(α). In this way, a record α′ will eventually no longer taint records written by a merchant, if the merchant does not read α′ again.
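One way to picture this expiring alternative is a bounded, recency-ordered log of the records a merchant has read. The Python sketch below is an assumption-laden illustration: the class name `RecentReads`, the `limit` parameter, and the use of string record identifiers are hypothetical, and a real PDB would presumably keep one such log per merchant and persona.

```python
from collections import OrderedDict


class RecentReads:
    """Remembers only the `limit` records most recently read by one merchant."""

    def __init__(self, limit: int):
        self.limit = limit
        self._reads = OrderedDict()  # record id -> None, oldest read first

    def mark_read(self, record_id: str) -> None:
        self._reads.pop(record_id, None)     # re-reading refreshes recency
        self._reads[record_id] = None
        while len(self._reads) > self.limit:
            self._reads.popitem(last=False)  # forget the oldest read

    def taint_pointers(self) -> list:
        """Records a' to point to (a -> a') when the merchant inserts a record a."""
        return list(self._reads)
```

A record that falls out of the log no longer contributes a pointer to later insertions, which is the sense in which its taint expires unless the merchant reads it again.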

[0095] (b) Reading Records

[0096] The records that a merchant reads are a primary factor in determining the taint properties of records that merchant inserts; see clause (3) in the relation defined above in subsection (a). In order to minimize unnecessary tainting, it is important that merchants read only records that are directly relevant to the customization decisions they make. The present invention thus provides a read interface for the PDB that makes it possible for merchants to be very targeted in the records they read.

[0097] The PDB interface for reading records supports two types of operations. These operations are respectively illustrated in FIGS. 6A and 6B. The first operation 602, here called create_list, takes as its arguments a PAC and a scoring function specified by the merchant. The scoring function ƒ accepts as input a single record and returns a floating point value, called a score. Intuitively, for a record α, the score ƒ(α) indicates α's relevance to the customization decision that the merchant must make, as determined by the scoring function ƒ. For example, a reasonable scoring function might return higher scores for more recent records, records that indicate large purchases by the visitor, or records that match the merchant's inventory well. In an illustrative implementation, the scoring function is a Java class file that the merchant administrators craft, and that is required to implement a function with no side effects (i.e., no network communication, disk accesses, etc.).

[0098] The create_list operation applies the scoring function ƒ to all records to which the merchant has access for the persona indicated by the PAC. The return value from create_list(PAC, ƒ) is a reference L to a linked list of records sorted by descending scores, stored at the PDB. Importantly, invoking the operation create_list does not “count” as reading records, since the reference L that it returns does not indicate information about the content of records, their scores, or even how many records are in the resulting linked list stored at the PDB.

[0099] The only operation available to the merchant using the reference L is to invoke next(L). This operation, denoted as 604 in FIG. 6B, initially returns the record at the head of the list, and when successively invoked it returns the next record in the linked list. Each record returned to the merchant is marked as having been read by the merchant, for the purpose of determining the records α such that α′→α for the records α′ the merchant inserts. The merchant can sample the first few records of L to determine whether they suit the merchant's needs. If so, these can be used to customize content for the visitor. If not, the merchant site may form a new list by invoking create_list with a different scoring function. This interface requires the merchant to read very few records per visitor in order to customize its content, thereby limiting unnecessary tainting.
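The two-operation read interface can be sketched in Python as follows. This is a toy model, not the interface of the invention: PAC checking and taint-based access filtering are omitted, records are plain dictionaries, the scoring function is a Python callable rather than a Java class file, and the class name `ProfileDatabase` and the `read_log` attribute are hypothetical.

```python
import itertools


class ProfileDatabase:
    """Toy PDB exposing only create_list and next, as described above."""

    def __init__(self, records):
        self._records = list(records)        # records the merchant may access
        self._lists = {}                     # opaque handle -> iterator over a sorted list
        self._handles = itertools.count(1)
        self.read_log = []                   # records marked as read by the merchant

    def create_list(self, pac, score):
        # `pac` is accepted only to mirror the described signature; it is not checked here.
        ranked = sorted(self._records, key=score, reverse=True)
        handle = next(self._handles)         # built-in next(), not the method below
        self._lists[handle] = iter(ranked)
        return handle                        # reveals neither contents, scores, nor length

    def next(self, handle):
        record = next(self._lists[handle], None)
        if record is not None:
            self.read_log.append(record)     # only now does the read count for tainting
        return record
```

A merchant would call `create_list` with a scoring function such as `lambda r: float(r["price"])`, then call `next` a few times; only the records actually returned appear in `read_log`, which is what a PDB would consult when recording → pointers for that merchant's later insertions.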

[0100] (c) On Accessing Multiple PDBs

[0101] As described in section II above, the inventive infrastructure allows multiple PDBs, and further allows a single merchant to subscribe to multiple PDBs as it chooses. It is thus possible that a record at one PDB will be tainted by a record at another PDB. More precisely, when inserting a record α to a PDB D, the merchant's GCE propagates to D a reference to each record α′ at another PDB such that α→α′. The graph traversals in the algorithms of subsection (a) above may then require communication across PDBs to complete. If a needed PDB is unreachable, the algorithms can respond conservatively: e.g., in the case of determining whether a merchant can read a record, if the PDB at which a necessary record α′ resides is unavailable, then the merchant is disallowed.

[0102] We note that placing responsibility on merchant GCEs to propagate this taint information poses minimal risk to the enforcement of tainting policies. First, there is little motivation for a merchant writing a record α′ to suppress the fact that α′→α; doing so merely decreases the degree to which α′ is tainted. Second, the fact that merchant(α′) read α means that merchant(α′) ∈ CLASS_α[0]. That is, merchant(α) already trusts merchant(α′) with α, and so trusting merchant(α′) to propagate the fact that α′→α extends this trust minimally. Third, since the PDB storing α maintains the time at which α was read, and the PDB storing α′ similarly records the time at which α′ was inserted, such suppression is readily detected in an audit involving both PDBs. Thus, communicating records outside the infrastructure is a less risky approach to violating the inventive tainting model, consistent with the advantages described above in section I.

[0103] (d) Merchant Taint Class Configuration

[0104] The tainting model described above gives each merchant fine-grained control over where its records, and information derived from them, flow. The merchant m exercises this control by specifying taint classes on each record it writes, which for convenience will usually be composed of sets of merchants that m previously defined, e.g., the M_partners and M_noncompetitors described above. In this subsection, we describe a configuration interface by which m can define such sets of merchants at a PDB.

[0105] A portion of this interface is shown in FIG. 7. In this figure, the merchant “Genesis Sport” is configuring a group that it calls “noncompetitors,” as indicated in the heading of the page. Shown in the lower right screen quadrant 702 is the taint class (here given a more user-friendly name, “collaboration affinity”) in which the merchants in this group are included by default when Genesis Sport writes a record to this PDB. As shown, Genesis Sport by default gives its noncompetitors immediate access to the records it writes, as indicated by specifying a taint class of 0 for them. The noncompetitors of Genesis are listed in the lower left screen quadrant 704. For example, Genesis includes book stores in its list of noncompetitors, but does not include “Ocean Diveshop.”

[0106] The upper two screen quadrants make the task of formulating the noncompetitors list easier for Genesis. The upper left screen quadrant 706 contains a list of all merchants registered to use this PDB. Genesis can select individual merchants to add to its list. In addition, Genesis can choose categories of merchants to include or exclude from its list using the upper right screen quadrant 708. The category of a merchant is specified by the merchant when it is registered to use this PDB (Genesis itself is a diving store, as indicated in the “Business groups” heading of the page). Since this categorization may not be entirely reliable, adding a category of merchants simply lists the new merchants in the lower left screen quadrant 704. Genesis can then inspect the merchants that were added, before committing these additions to its noncompetitors list.

[0107] V. Illustrative Applications

[0108] It is to be appreciated that the design of the illustrative implementation of the infrastructure of the invention was influenced by a focus on the business-to-consumer market. For example, this is manifested in that the invention operates with unmodified client browsers (e.g., Netscape, Internet Explorer, etc.) where it is known that relying on user installation of new software can be a barrier to adoption. It is also manifested in the attention given to user privacy. However, the principles of the invention may be applied in certain business-to-business (B2B) settings, as well.

[0109] One application of our design in B2B settings is in so-called “ScenarioNets,” which is a model of interaction to which some B2B e-markets are evolving. Seybold et al. (in Seybold et al. “Understanding the B2B and E-Market Landscape,” Customer.com Focused Research Collection, Patricia Seybold Group, Inc., 2000, the disclosure of which is incorporated by reference herein, at pg. 36-39) define a ScenarioNet to be a customer- and project-specific set of interrelated tasks that can be performed across web sites and suppliers to accomplish a specific outcome. The importance of “customer- and project-specific” is that the sequence of interrelated tasks may be so customized to the customer and project that it is not anticipated or directly supported by a vertical or horizontal e-market. Seybold et al. suggest supporting ScenarioNets by providing a way for the customer to carry the context of previously completed tasks from one web site to the next, so that already-entered information and results of already-completed tasks are available to the next sites and applications in the sequence. The infrastructure of the invention can support ScenarioNets in this way, where the user employs a persona per sequence of tasks. The inventive infrastructure provides both the techniques for context to be carried from one step to the next and mechanisms to protect the sensitive information of both the user and web sites involved in the sequence of tasks. And, in contrast to the support offered by GroupWare systems, the inventive infrastructure need not be configured with advance knowledge of the sequence of related tasks.

[0110] Since certain B2B settings may be more amenable to the introduction of custom client software, in one embodiment, the client software may embody the persona server for this user, or even the PDB contents themselves. However, the latter organization would centralize all data in a way that reveals a single profile for the user if this centralized store were compromised.

[0111] VI. Exemplary Computer System Architecture

[0112] Referring now to FIG. 8, an exemplary architecture is illustrated for each computer system communicating over the network. Thus, it is to be understood that the exemplary architecture in FIG. 8 may represent the architecture of each of the computer systems operating in the infrastructure shown in FIG. 2, i.e., the user computer system 202, the merchant web site server systems 204-1 through 204-M, the personae server 206, and the PDBs 208-1 through 208-N. As mentioned, the personae server may be a virtual server. Also, each merchant server file system and/or PDB may include one or more such computer systems.

[0113] As shown, each computer system may comprise a processor 802, memory 804, and I/O devices 806. It should be understood that the term “processor” as used herein may include one or more processing devices, including a central processing unit (CPU) or other processing circuitry. Also, the term “memory” as used herein is intended to include memory associated with a processor or CPU, such as RAM, ROM, a fixed memory device (e.g., hard drive), or a removable memory device (e.g., diskette or CDROM). In addition, the term “I/O devices” as used herein is intended to include one or more input devices (e.g., keyboard, mouse) for inputting data to the processing unit, as well as one or more output devices (e.g., CRT display) for providing results associated with the processing unit. Accordingly, software program instructions or code for performing all or portions of the methodologies of the invention, described herein, may be stored in one or more of the associated memory devices, e.g., ROM, fixed or removable memory, and, when ready to be utilized, loaded into RAM and executed by the CPU.

[0114] Accordingly, as described above in detail, the present invention provides techniques and infrastructure for supporting global customization. The invention enables persona profiles of user information to be maintained, and such persona profiles to be accessed by merchants. Via the persona abstraction, users control what information is grouped into a persona profile, and can selectively enable a merchant to read one of these profiles. The infrastructure of the invention employs a persona server that assists users in managing their personae. The infrastructure of the invention separates this from the profile databases at which persona profile information is stored, to eliminate any single point at which different persona profiles can be tied to the same user. Since merchants also have privacy concerns, the infrastructure of the invention provides a data protection model based on tainting, by which merchants can limit how the information they contribute can be exposed.

[0115] Although illustrative embodiments of the present invention have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the invention.

What is claimed is:
 1. A method for use in a distributed data network wherein a user may request and receive content from one or more entities in the distributed data network, the method comprising the steps of: providing one or more mechanisms for enabling at least one of the user and one or more of the entities to control which entities in the distributed data network have access to information generated in association with the user's activity on the distributed data network; and customizing content to be received by the user in accordance with at least a portion of the accessible information.
 2. The method of claim 1, wherein the step of providing the one or more control mechanisms for the user comprises the step of enabling the user to specify two or more roles within which the user may perform activities on the distributed data network.
 3. The method of claim 2, further wherein the two or more roles have two or more profiles respectively associated therewith.
 4. The method of claim 3, further wherein the two or more profiles are substantially unlinkable.
 5. The method of claim 4, wherein the substantial unlinkability of the profiles substantially prevents an entity from learning about the user's activity at another entity, when the user conducts activities at the different entities in the different roles.
 6. The method of claim 2, wherein the roles are specified in accordance with at least one dedicated server located in the distributed data network.
 7. The method of claim 1, wherein at least one of the one or more entities are merchants operating on the distributed data network.
 8. The method of claim 1, wherein the step of providing the one or more control mechanisms for the one or more entities comprises the step of enabling the one or more entities to specify which other entities are able to access information that the one or more entities learned in association with the user conducting activities with the one or more entities.
 9. The method of claim 8, further wherein the one or more entities are enabled to specify which other entities are able to access information derived from original information that the one or more entities learned in association with the user conducting activities with the one or more entities.
 10. The method of claim 9, wherein the one or more entities are enabled to specify a degree of information derivation in accordance with which other entities may be able to access the information.
 11. The method of claim 10, wherein the one or more entities are enabled to group the other entities into one or more classes wherein each class has a degree of information derivation associated therewith.
 12. The method of claim 1, wherein the one or more entities access the information in accordance with one or more dedicated databases located in the distributed data network.
 13. A method for use in accordance with at least one server in a distributed data network wherein a user may request and receive content from one or more entities in the distributed data network, the method comprising the steps of: maintaining two or more user-specified policies respectively associated with two or more roles within which the user may perform activities on the distributed data network; and issuing access credentials associated with the user-specified policies to one or more entities that seek to access information generated in association with the user's activity on the distributed data network so as to customize content to be received by the user in accordance with at least a portion of the accessible information.
 14. The method of claim 13, wherein the access credentials comprise rights by which the entity may access the information.
 15. The method of claim 14, wherein the access rights comprise at least one of information read rights, information insert rights and information delete rights.
 16. The method of claim 14, wherein the access credentials further comprise an identifier of the entity to which the access credentials are being issued.
 17. The method of claim 14, wherein the access credentials further comprise an expiration time specifying a duration of the access rights.
 18. The method of claim 14, wherein the access credentials further comprise a digital signature on the access credentials.
 19. The method of claim 18, wherein the access credentials further comprise a public key matching a private key by which the access credentials have been digitally signed.
 20. The method of claim 13, wherein the maintaining step further comprises prompting the user to specify a new role or an existing role within which the user may perform activities on the distributed data network.
 21. A method for use in accordance with one or more databases in a distributed data network wherein a user may request and receive content from one or more entities in the distributed data network, the method comprising the steps of: storing information that the one or more entities learn in association with the user conducting activities with the one or more entities; and enabling the one or more entities to specify which other entities are able to access the stored information so as to customize content to be received by the user in accordance with at least a portion of the accessible information.
 22. The method of claim 21, wherein the information that the one or more entities learn in association with the user conducting activities with the one or more entities comprises at least one of original information and information derived from the original information.
 23. The method of claim 21, wherein the enabling step further comprises enabling the one or more entities to specify one or more taint classes for portions of the stored information.
 24. The method of claim 23, wherein a given taint class corresponds to an affinity an entity has to collaborate with entities in the given taint class.
 25. The method of claim 23, wherein at least portions of the information are respectively stored as records, wherein each record has stored in association therewith a data structure comprising at least one of an accumulated taint strength, a set of taint classes, and pointers to one or more original records from which this record was derived.
 26. The method of claim 25, wherein an entity is not permitted to read a record derived from an original record if the entity is not a member of a specified taint class and there is a path of a given length or less from the derived record to the original record.
 27. The method of claim 21, further comprising the step of applying a scoring function to portions of the stored information to which a given entity has access.
 28. The method of claim 27, wherein results of the scoring function indicate the relevance of the portions of the stored information to one or more content customization decisions to be made by the given entity.
 29. Apparatus for use in a distributed data network wherein a user may request and receive content from one or more entities in the distributed data network, the apparatus comprising: at least one processor operative to: (i) maintain two or more user-specified policies respectively associated with two or more roles within which the user may perform activities on the distributed data network; and (ii) issue access credentials associated with the user-specified policies to one or more entities that seek to access information generated in association with the user's activity on the distributed data network so as to customize content to be received by the user in accordance with at least a portion of the accessible information.
 30. The apparatus of claim 29, wherein the access credentials comprise rights by which the entity may access the information.
 31. The apparatus of claim 30, wherein the access rights comprise at least one of information read rights, information insert rights and information delete rights.
 32. The apparatus of claim 30, wherein the access credentials further comprise an identifier of the entity to which the access credentials are being issued.
 33. The apparatus of claim 30, wherein the access credentials further comprise an expiration time specifying a duration of the access rights.
 34. The apparatus of claim 30, wherein the access credentials further comprise a digital signature on the access credentials.
 35. The apparatus of claim 34, wherein the access credentials further comprise a public key matching a private key by which the access credentials have been digitally signed.
 36. The apparatus of claim 29, wherein the at least one processor is further operative to prompt the user to specify a new role or an existing role within which the user may perform activities on the distributed data network.
 37. Apparatus for use in a distributed data network wherein a user may request and receive content from one or more entities in the distributed data network, the apparatus comprising: at least one processor operative to: (i) store information that the one or more entities learn in association with the user conducting activities with the one or more entities; and (ii) enable the one or more entities to specify which other entities are able to access the stored information so as to customize content to be received by the user in accordance with at least a portion of the accessible information.
 38. The apparatus of claim 37, wherein the information that the one or more entities learn in association with the user conducting activities with the one or more entities comprises at least one of original information and information derived from the original information.
 39. The apparatus of claim 37, wherein the enabling operation further comprises enabling the one or more entities to specify one or more taint classes for portions of the stored information.
 40. The apparatus of claim 39, wherein a given taint class corresponds to an affinity an entity has to collaborate with entities in the given taint class.
 41. The apparatus of claim 39, wherein at least portions of the information are respectively stored as records, wherein each record has stored in association therewith a data structure comprising at least one of an accumulated taint strength, a set of taint classes, and pointers to one or more original records from which this record was derived.
 42. The apparatus of claim 41, wherein an entity is not permitted to read a record derived from an original record if the entity is not a member of a specified taint class and there is a path of a given length or less from the derived record to the original record.
 43. The apparatus of claim 37, wherein the at least one processor is further operative to apply a scoring function to portions of the stored information to which a given entity has access.
 44. The apparatus of claim 43, wherein results of the scoring function indicate the relevance of the portions of the stored information to one or more content customization decisions to be made by the given entity.
 45. Apparatus for use in a distributed data network wherein a user system may request and receive content from one or more servers associated with entities in the distributed data network, the apparatus comprising: at least one server in the distributed data network operative to: (i) maintain two or more user-specified policies respectively associated with two or more roles within which the user system may perform activities on the distributed data network; and (ii) issue access credentials associated with the user-specified policies to one or more entity servers that seek to access information generated in association with the user system's activity on the distributed data network so as to customize content to be received by the user system in accordance with at least a portion of the accessible information; and one or more databases in the distributed data network operative to: (i) store information that the one or more entity servers learn in association with the user conducting activities with the one or more entities; and (ii) enable the one or more entities to specify which other entities are able to access the stored information so as to customize content to be received by the user in accordance with at least a portion of the accessible information.
 46. The apparatus of claim 45, wherein the at least one server is a virtual server.
 47. The apparatus of claim 45, wherein the user system comprises a browser program for requesting and receiving content.
 48. The apparatus of claim 45, wherein the one or more entity servers host merchant sites which a user may selectively visit in accordance with the user system.
 49. The apparatus of claim 45, wherein the distributed data network is the Internet.
 50. An article of manufacture for use in accordance with at least one server in a distributed data network wherein a user may request and receive content from one or more entities in the distributed data network, the article comprising a machine readable medium containing one or more programs which when executed implement the steps of: maintaining two or more user-specified policies respectively associated with two or more roles within which the user may perform activities on the distributed data network; and issuing access credentials associated with the user-specified policies to one or more entities that seek to access information generated in association with the user's activity on the distributed data network so as to customize content to be received by the user in accordance with at least a portion of the accessible information.
 51. An article of manufacture for use in accordance with one or more databases in a distributed data network wherein a user may request and receive content from one or more entities in the distributed data network, the article comprising a machine readable medium containing one or more programs which when executed implement the steps of: storing information that the one or more entities learn in association with the user conducting activities with the one or more entities; and enabling the one or more entities to specify which other entities are able to access the stored information so as to customize content to be received by the user in accordance with at least a portion of the accessible information.
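
By way of illustration only, the access credentials recited in claims 14 through 18 (access rights, an identifier of the receiving entity, an expiration time, and a signature) may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: a keyed HMAC from the Python standard library stands in for the digital signature, so the public key of claim 19 is not modeled, and the names `Credential`, `issue`, `permits`, and `persona_id` are hypothetical.

```python
import hashlib
import hmac
import json
import time
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Credential:
    entity_id: str          # identifier of the entity the credential is issued to
    persona_id: str         # hypothetical handle for the persona profile concerned
    rights: FrozenSet[str]  # subset of {"read", "insert", "delete"}
    expires_at: float       # expiration time, in seconds since the epoch
    signature: str = ""     # HMAC stand-in for a digital signature (assumption)


def _payload(entity_id: str, persona_id: str,
             rights: Iterable[str], expires_at: float) -> bytes:
    # Canonical byte encoding of the signed fields.
    return json.dumps([entity_id, persona_id, sorted(rights), expires_at]).encode()


def issue(key: bytes, entity_id: str, persona_id: str, rights: Iterable[str],
          ttl_seconds: float, now: Optional[float] = None) -> Credential:
    """Issue a credential valid for ttl_seconds from now."""
    now = time.time() if now is None else now
    granted = frozenset(rights)
    expires_at = now + ttl_seconds
    sig = hmac.new(key, _payload(entity_id, persona_id, granted, expires_at),
                   hashlib.sha256).hexdigest()
    return Credential(entity_id, persona_id, granted, expires_at, sig)


def permits(key: bytes, cred: Credential, entity_id: str, right: str,
            now: Optional[float] = None) -> bool:
    """Check signature, holder, requested right, and expiration."""
    now = time.time() if now is None else now
    expected = hmac.new(key, _payload(cred.entity_id, cred.persona_id,
                                      cred.rights, cred.expires_at),
                        hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, cred.signature)
            and cred.entity_id == entity_id
            and right in cred.rights
            and now < cred.expires_at)


# Hypothetical example: a 60-second read-only credential for one merchant.
key = b"demo-key"
cred = issue(key, "merchant-a", "persona-1", {"read"}, ttl_seconds=60, now=1000.0)
print(permits(key, cred, "merchant-a", "read", now=1010.0))
```

A deployment following claim 19 would replace the shared-key HMAC with a public-key signature scheme, so that a profile database could verify a credential without holding the issuing server's signing key.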